For multivariate data, Tukey's half-space depth is one of the most popular depth functions available
in the literature. It is conceptually simple and satisfies several desirable properties of depth
functions. The Tukey median, the multivariate median associated with the half-space depth, is 
also a well-known measure of center for multivariate data with several interesting properties. 
In this article, we derive and investigate some interesting properties of half-space depth and its 
associated multivariate median. These properties, some of which are counterintuitive, have im- 
portant statistical consequences in multivariate analysis. We also investigate a natural extension 
of Tukey's half-space depth and the related median for probability distributions on any Banach 
space (which may be finite- or infinite-dimensional) and prove some results that demonstrate 
anomalous behavior of half-space depth in infinite-dimensional spaces. 
1. Introduction

Over the last three decades, data depth has emerged as a powerful concept leading to the 
generalization of many univariate statistical methods to the multivariate setup. A depth 
function measures the centrality of a point x with respect to a data set or a probability 
distribution and thus helps to define an ordering and a version of ranks for multivariate 
data. There are several notions of data depth available in the literature (see, e.g., [13- 
16, 21, 22]). Tukey's half-space depth (see [20]) is one of the most popular depth functions 
used by many researchers. The construction of central regions based on trimming (see, 
e.g., [17]), robust estimation of multivariate location (see, e.g., [6]), tests of multivariate 
statistical hypotheses (see, e.g., [2]) and supervised classification (see, e.g., [7, 7]) are 
some examples of its widespread application. 

Like other popular depth functions, half-space depth has some nice theoretical properties.
In fact, it satisfies all four of the desirable properties of depth functions first
mentioned in [12] and subsequently investigated in [22], namely, affine invariance,
maximality at the center, monotonicity with respect to the deepest point and vanishing at
infinity. Moreover, if the underlying population distribution $F$ has a spherically symmetric
density $f$, that is, $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_2)$ for some $\psi : \mathbb{R}_+ \to \mathbb{R}_+$, the half-space depth turns
out to be a decreasing function of $\|\mathbf{x}\|_2 = (|x_1|^2 + \cdots + |x_d|^2)^{1/2}$. Consequently, when $\psi$ is
monotonically decreasing (i.e., $f$ is unimodal), the half-space depth becomes an increasing
function of $f$ and vice versa. Therefore, in such cases, the half-space depth contours
coincide with the contours of the density function. Because of this property of the half- 
space depth, classification rules based on the ordering of the half-space depth functions 
coincide with the optimal Bayes classifier for discriminating among spherically symmetric
unimodal populations differing in their centers of symmetry (see, e.g., [8]). Similarly, the 
use of the half-space depth functions to order and trim multivariate data sets (see, e.g., 
[6, 17]) leading to the determination of central and outlying observations has a natural 
justification when the density contours coincide with the half-space depth contours. Also, 
due to this relation between half-space depth and spherical symmetry, half-space depth 
has been used to construct diagnostic tools for checking spherical symmetry of a data 
cloud (see, e.g., [13], pages 809-811). Another well-known feature of half-space depth is 
its characterization property. Koshevoy [10] proved that if the half-space depth functions 
of two atomic measures with finite support are identical, then the measures are also
identical. Cuesta-Albertos and Nieto-Reyes [4] proved this characterization property of
Tukey depth for discrete distributions. Under some regularity conditions, Koshevoy [11]
proved this characterization property for absolutely continuous probability distributions
with compact support in finite-dimensional spaces. Hassairi and Regaieg [9] generalized
it to absolutely continuous distributions with connected supports. 

However, the half-space depth function has several limitations. The half-space me- 
dian derived from half-space depth has a lower breakdown point and relative efficiency 
compared to the median based on projection depth (see [23]). Dang and Serfling [5] 
pointed out that the outlier identifier based on the half-space depth has a "severe" and
"unacceptable" trade-off between "masking breakdown point" and "false positive rate".
Moreover, if the half-space depth contours fail to match the density contours, then the 
classifiers based on half-space depth may lead to misclassification rates higher than the 
Bayes risk. The diagnostic tool developed in [13], pages 809-811 for detecting deviations 
from spherical symmetry using half-space depth also relies heavily on the fact that under
$l_2$-symmetry, the depth contours are concentric spheres with half-space median at
the center. So, in the absence of this property of the half-space depth contours, such
a diagnostic tool may not lead to useful results. Now, a natural question that arises
from this discussion is whether this property of half-space depth contours holds for other
symmetric distributions, for example, in the case of $l_p$-symmetric distributions, when
$f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$ for some $p \neq 2$ and $\psi$ is monotonically decreasing. Here, for any $p > 0$
and $\mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$, we define $\|\mathbf{x}\|_p = (|x_1|^p + \cdots + |x_d|^p)^{1/p}$. In Section 2, we carry
out an investigation to answer this question.

For any continuous univariate distribution, it is straightforward to see that the me- 
dian is the point with half-space depth 0.5. In Section 3, we investigate to what extent 
this property of half-space median holds for multivariate continuous distributions and 
derive a characterization of the multivariate distribution for which the half-space depth 
of Tukey median will achieve its maximum value, namely 0.5. We propose a statistical 
test for angular symmetry of continuous multivariate distributions based on this characterization
and briefly study the performance of the proposed test. In this section, we
also consider natural extensions of half-space depth and half-space median for probability 
distributions in arbitrary Banach spaces using the concept of linear functionals on such 
spaces. Some anomalous behaviors of half-space depth for probability distributions on 
infinite-dimensional spaces and their implications are discussed in Section 4. Proofs of 
theorems and lemmas (along with their statements) are deferred to the Appendix. 

2. Half-space depth contours for $l_p$-symmetric density functions

In this section, we study the behavior of the half-space depth contours for a wide class 
of symmetric distributions. As was mentioned in the Introduction, the half-space depth 
contours coincide with the density contours if the p.d.f. $f$ is such that $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_2)$
for some monotonically decreasing $\psi : \mathbb{R}_+ \to \mathbb{R}_+$, and this is an important feature of
half-space depth with many useful statistical applications. Here, we will investigate the
situation when $\|\cdot\|_2$ is replaced by $\|\cdot\|_p$, where $p$ is positive and $p \neq 2$.

2.1. Depth contours for $p = \infty$

For $p = \infty$, the p.d.f. is $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_d) = \psi(\max\{|x_1|, |x_2|, \ldots, |x_d|\})$ for some
monotonically decreasing function $\psi$. Clearly, the density contours here are concentric
$d$-dimensional hypercubes with the origin at the center. We now check whether or not all
points on the surface of a hypercube with origin at the center have the same depth. First,
consider the point $A = (1, 0, \ldots, 0)$ on the surface of the unit hypercube $\{\mathbf{x} : \|\mathbf{x}\|_\infty = 1\}$
(see Figure 1 for a diagram in the case $d = 2$). It can be shown that the hyperplane
$x_1 = 1$ determines the half-space depth of this point, and this depth is $P(X_1 \ge 1)$, where
$\mathbf{X} = (X_1, X_2, \ldots, X_d)$ has the p.d.f. $f(\mathbf{x})$ (see Lemma 1 in the Appendix).

Note that the hyperplane $x_1 = 1$ also passes through the point $B = (1, 1, 0, \ldots, 0)$ (see the right-hand
diagram in Figure 1 when $d = 2$). So, $A$ and $B$ will have the same depth if and only if
there exists no other hyperplane that passes through $B$ in such a way that the probability
of one of its half-spaces is smaller than $P(X_1 \ge 1)$. However, the hyperplane $x_1 + x_2 = 2$
passes through the point $B$, and we can show that $P(X_1 + X_2 \ge 2) < P(X_1 \ge 1)$ (see
Lemma 2 in the Appendix). This implies that if the p.d.f. $f$ is of the form $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_\infty)$
with a monotonically decreasing $\psi$, then the half-space depth contours cannot coincide
with the corresponding density contours.
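
The inequality of Lemma 2 can be checked numerically for one particular density. The sketch below is only an illustration under an assumed example, $f(\mathbf{x}) \propto \exp(-\max\{|x_1|, |x_2|\})$ in dimension $d = 2$ (this choice of $\psi$, the grid size and all variable names are assumptions, not taken from the paper); it approximates both probabilities by a midpoint rule on a grid.

```python
import numpy as np

# Assumed example density (not from the paper): f(x) proportional to
# exp(-max(|x1|, |x2|)), an l_infinity-symmetric density with decreasing psi.
half_width, step = 20.0, 0.05
# Midpoint grid over [-20, 20]^2; the mass outside this square is negligible.
grid = np.arange(-half_width + step / 2, half_width, step)
x1, x2 = np.meshgrid(grid, grid, indexing="ij")
weights = np.exp(-np.maximum(np.abs(x1), np.abs(x2)))
weights /= weights.sum()  # normalise the cell masses so they sum to one

p_axis = weights[x1 >= 1.0].sum()        # approximates P(X1 >= 1), depth of A
p_diag = weights[x1 + x2 >= 2.0].sum()   # approximates P(X1 + X2 >= 2)
print(f"P(X1 >= 1) ~ {p_axis:.4f}, P(X1 + X2 >= 2) ~ {p_diag:.4f}")
```

For this density a direct calculation gives $P(X_1 \ge 1) = 3/(4e)$ and $P(X_1 + X_2 \ge 2) = 1/(2e)$, so the point $B$ has strictly smaller depth than $A$ although both lie on the same density contour.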

2.2. Depth contours for $1 \le p < \infty$

Next, consider the case where $1 \le p < \infty$. Clearly, $A = (2^{1/p} c, 0, 0, \ldots, 0)$ and $B = (c, c, 0, \ldots, 0)$
are two points on the same $l_p$ contour (see Figure 2 for the case $d = 2$). First,
we check whether or not the half-space depths of these two points are equal. In view of
Lemma 1, the depth of $A$ is given by $P(X_1 \ge 2^{1/p} c)$ when $c > 0$. We can also prove that
the hyperplane $x_1 + x_2 = 2c$ determines the half-space depth of $B$ and that this depth is
$P(X_1 + X_2 \ge 2c)$ (see Lemma 3 in the Appendix).

It follows from the discussion in the preceding paragraph that the two points $A$ and $B$
will have the same depth only if $P(X_1 \ge 2^{1/p} c) = P(X_1 + X_2 \ge 2c)$. Note that here we can
choose $c$ arbitrarily. Therefore, the depth and the density contours can coincide only if
$P(X_1 \ge 2^{1/p} c) = P(X_1 + X_2 \ge 2c)$ for all values of $c$, that is, only if $X_1$ and $2^{a}(X_1 + X_2)$
are identically distributed for $a = (1 - p)/p$. Now, if we assume the existence of the second
order moments of the $X_i$'s, then the equality of the variances of $X_1$ and $2^{a}(X_1 + X_2)$ and
the fact that $X_1$ and $X_2$ are uncorrelated (in view of the $l_p$-symmetry of the density $f$)
imply that $a = -1/2$, that is, $p = 2$. Even if we do not assume any moment condition, the above
result holds (see Lemma 4 in the Appendix). Also, it is interesting to note that for $p < 2$,
we can always choose a $c$ such that the depth of $B$ is more than that of $A$. On the other
hand, for $p > 2$, it is always possible to choose a $c$ such that $A$ has larger depth than $B$.
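
The variance step in the argument above can be written out explicitly. The following is a short worked computation under the stated second-moment assumption; the symbol $\sigma^2$ for the common variance of $X_1$ and $X_2$ (equal by the permutation symmetry of $f$) is introduced here only for illustration.

```latex
\begin{align*}
\operatorname{Var}\bigl(2^{a}(X_1 + X_2)\bigr)
  &= 2^{2a}\bigl(\operatorname{Var}(X_1) + \operatorname{Var}(X_2)\bigr)
   = 2^{2a+1}\sigma^2, \\
2^{2a+1}\sigma^2 = \sigma^2
  &\iff 2a + 1 = 0 \iff a = -\tfrac{1}{2}, \\
\frac{1-p}{p} = -\frac{1}{2}
  &\iff 2 - 2p = -p \iff p = 2 .
\end{align*}
```

The first equality uses that $X_1$ and $X_2$ are uncorrelated, so the covariance term vanishes.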

2.3. Depth contours for p < 1 

Finally, we investigate the case $p < 1$. Note that in this case, the regions bounded by $l_p$
contours are no longer convex sets (see Figure 3 for the case $d = 2$). Consider three points
$A = (1, 0, \ldots, 0)$, $B = (0, 1, 0, \ldots, 0)$ and $C = (\alpha, \beta, 0, \ldots, 0)$ on the same $l_p$ contour, where
$\alpha, \beta > 0$ and $|\alpha|^p + |\beta|^p = 1$. Consider any hyperplane passing through $C$. It will split $\mathbb{R}^d$
into two half-spaces, one of which will contain the origin. Since $p < 1$, at least one of the
two points $A$ and $B$ will lie in the half-space that does not contain the origin. Without
loss of generality, we can assume that the hyperplane that determines the half-space
depth of $C$ puts $B$ and the origin in two different half-spaces (see the bold line in Figure
3 for the case $d = 2$). We can now make a parallel shift of that hyperplane away from the
origin until it hits the point $B$ (see the dotted line in Figure 3 for the case $d = 2$). Clearly,
the half-space created by this new hyperplane that has smaller probability measure will
have smaller probability than that of each of the two half-spaces created by the older
hyperplane. Therefore, the half-space depth of $B$ has to be smaller than that of $C$ and
hence the depth contours cannot coincide with the density contours.

Figure 3. $l_p$ contour for the case $p = 1/2$.

Figure 4. Density contours and their corresponding half-space depth contours.

Summarizing our discussion in this section, we now have the following theorem. 

Theorem 1. Consider a probability distribution on $\mathbb{R}^d$ with the p.d.f. $f$ such that $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$
for some monotonically decreasing function $\psi$. The half-space depth contours
associated with $f$ will then coincide with the density contours if and only if $p = 2$.

Figure 4 presents the empirical half-space depth contours (indicated using connected
lines) computed using 500 observations from bivariate $l_p$-symmetric distributions with
different values of $p$ (i.e., $p = 1/2, 1, 2, 5$). In each case, we consider the density to be of the
form $f(\mathbf{x}) = \frac{p^2}{4\Gamma^2(1/p)} \exp\{-(|x_1|^p + |x_2|^p)\}$ and the corresponding density contours are
also plotted (indicated using dotted lines) in Figure 4. From this figure, it is quite evident
that the half-space depth contours and the density contours are markedly different when
$p \neq 2$. So, unlike what was done by [13], pages 809-811, we cannot develop a diagnostic
tool for checking $l_p$-symmetry using half-space depth when $p \neq 2$.
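
Observations from a density proportional to $\exp\{-(|x_1|^p + |x_2|^p)\}$ can be simulated coordinate by coordinate. The sketch below is one possible way to do this, not necessarily the procedure used for Figure 4; the function name `sample_lp_symmetric` and its parameters are hypothetical. It relies on the fact that if $G$ has a Gamma distribution with shape $1/p$ and unit scale, then $G^{1/p}$ has density proportional to $\exp(-t^p)$ on $t > 0$.

```python
import numpy as np

def sample_lp_symmetric(n, p, d=2, seed=0):
    """Draw n points with density proportional to exp(-(|x_1|^p + ... + |x_d|^p))."""
    rng = np.random.default_rng(seed)
    # If G ~ Gamma(shape=1/p, scale=1), then G**(1/p) has density
    # proportional to exp(-t^p) on t > 0.
    magnitude = rng.gamma(shape=1.0 / p, scale=1.0, size=(n, d)) ** (1.0 / p)
    # Independent random signs make each coordinate symmetric about zero.
    sign = rng.choice([-1.0, 1.0], size=(n, d))
    return sign * magnitude

for p in (0.5, 1.0, 2.0, 5.0):
    x = sample_lp_symmetric(500, p)
    print(p, x.shape, np.round(x.var(axis=0), 2))
```

As a sanity check, $p = 2$ gives independent normal coordinates with variance $1/2$, and $p = 1$ gives independent standard Laplace coordinates with variance $2$.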

It is also of interest to note that along with p = 2, for p = 1 and 5, the half-space depth 
contours are nearly circular. Since the diagnostic tool for spherical symmetry proposed 
in [13], pages 809-811, relies heavily on the sphericity of the depth contours, it may 
fail to detect the deviation from spherical symmetry in the cases p = 1 and 5. But for 
p = 1/2, since the depth contours are far from being circular, we can expect to detect this 
deviation using their diagnostic tool. This is what we observed when we performed the
following experiment. Following [13], pages 809-811, for different values of $q$ ($0 < q < 1$),
we found the smallest sphere $S_q$ containing the $q$th central hull and computed the fraction
of the data $r(q)$ lying in $S_q$. This fraction $r(q)$ is plotted against $q$ for four different
$l_p$-symmetric distributions with $p = 1/2, 1, 2$ and $5$, and these plots are presented in Figure
5. Note that if the underlying distribution is spherically symmetric (i.e., $l_2$-symmetric),
the resulting curve should lie near the diagonal line joining the points $(0,0)$ and $(1,1)$.
The area between the curve and the diagonal line gives an indication of the deviation
from spherical symmetry. As expected, for $p = 1, 2$ and $5$, these curves were close to
the diagonal line, but in the case $p = 1/2$, the curve had a significant deviation from
the diagonal line (see Figure 5). So, the diagnostic tool could detect the deviation from
spherical symmetry only in the case of $l_{1/2}$-symmetry.

We have seen that the half-space depth contours do not match the density contours 
for any $l_p$-symmetric distribution with $p \neq 2$, and this leads to several limitations on
statistical tools based on half-space depth, as was already discussed in the Introduction 
and the present section. However, it will be appropriate to note here that in such cases, 
the depth function may provide some useful information which may not be contained 
in the density function. While density is only a local measure, which measures the local 
probability mass, depth is a global measure, which gives useful information about global 
features like the central and outlying points of a data cloud or probability distribution. 
For instance, in the case of multivariate uniform distributions, the density function, 
being constant, fails to give any idea about the central and the peripheral points of 
the distribution; however, the half-space depth function provides a meaningful measure 
of central tendency, for example, by identifying the point with the maximum depth 
(see [18]). 

3. Half-space median and its depth 

As we have already pointed out in the Introduction, for continuous univariate distribu- 
tions, the median is the point with half-space depth 0.5. In a sense, this is a very desirable 
and natural property for a measure of the center of a distribution, and we would also like 
this property to hold in a multivariate setup. If this property holds for a multivariate
distribution, any hyperplane passing through the median will lead to two half-spaces
having equal probability measures. Unfortunately, as we will gradually see in this section,
this may not always be true for multivariate distributions, even if the distribution
is absolutely continuous with respect to the Lebesgue measure on a Euclidean space.

Figure 5. Diagnostic tool for checking spherical symmetry.

Note that for any $l_p$-symmetric density function $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$ with $0 < p < \infty$, the
origin turns out to be the half-space median with the half-space depth 0.5. In fact,
this is true whenever $\mathbf{X}$ and $-\mathbf{X}$ have the same distribution (i.e., the distribution is
centrally symmetric), or even under a slightly weaker condition that any real-valued linear
projection has median zero. We should also note that in all these cases, the half-space
median coincides with the coordinatewise median, and the depth of the half-space median,
namely the origin, is 0.5. However, this only holds for a special class of multivariate
distributions. For instance, for a bivariate uniform distribution on a right-angled isosceles
triangle, we can easily show that the half-space depth of any point is smaller than 0.5. We
can consider another interesting example of a continuous bivariate distribution, where
the p.d.f. $f$ has support on $\{(x_1, x_2) : x_1 + x_2 > 0, x_1 x_2 < 0\}$. In this case, if $f$ is symmetric
about the line $x_1 = x_2$, we can easily verify that the half-space median will have depth
smaller than 0.5, and the coordinatewise median will have zero half-space depth. We have
already indicated some sufficient conditions for the depth of the half-space median to be 
0.5, and in view of the two preceding examples, we would like to know some necessary 
and sufficient conditions for this. We now state a theorem, the proof of which is given in 
the Appendix. 
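
The triangle example can be explored numerically. The sketch below is an illustration only: the function `halfspace_depth_2d` is a hypothetical helper that approximates the empirical half-space depth in the plane by minimising over a finite grid of directions (an assumption that can slightly overestimate the exact empirical depth), and the sample size and seed are arbitrary. It evaluates the depth of the centroid of the triangle with vertices $(0,0)$, $(1,0)$ and $(0,1)$; a line through the centroid parallel to a side cuts off a similar triangle with area fraction $(2/3)^2 = 4/9$, so the population depth there cannot exceed $4/9$.

```python
import numpy as np

def halfspace_depth_2d(point, sample, n_directions=360):
    """Approximate empirical half-space depth of `point` within a bivariate sample.

    The infimum over all closed half-planes through `point` is replaced by a
    minimum over a finite grid of directions, so the result can slightly
    overestimate the exact empirical depth.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_directions, endpoint=False)
    directions = np.column_stack([np.cos(angles), np.sin(angles)])  # (K, 2)
    projections = (sample - point) @ directions.T                   # (n, K)
    # Fraction of sample points in each closed half-plane {u . (x - point) >= 0}.
    fractions = (projections >= 0.0).mean(axis=0)
    return fractions.min()

rng = np.random.default_rng(0)
# Uniform draws on the triangle with vertices (0,0), (1,0), (0,1):
# fold the unit square across the line x1 + x2 = 1.
u = rng.random((20000, 2))
fold = u.sum(axis=1) > 1.0
u[fold] = 1.0 - u[fold]
centroid = np.array([1.0 / 3.0, 1.0 / 3.0])
depth_at_centroid = halfspace_depth_2d(centroid, u)
print(depth_at_centroid)
```

The printed value should be near $4/9$ and clearly below $0.5$, in line with the claim that no point of this distribution attains depth $0.5$.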

Theorem 2. Suppose that $\mathbf{X}$ is a $d$-dimensional random vector with a probability distribution
which has its half-space median at $\boldsymbol{\mu} \in \mathbb{R}^d$. Then, the half-space depth of $\boldsymbol{\mu}$ will be
0.5 if and only if $(\mathbf{X} - \boldsymbol{\mu})/\|\mathbf{X} - \boldsymbol{\mu}\|_2$ and $(\boldsymbol{\mu} - \mathbf{X})/\|\mathbf{X} - \boldsymbol{\mu}\|_2$ are identically distributed.

This theorem implies that the half-space median will have depth 0.5 if and only if the 
underlying distribution is angularly symmetric. Liu et al. [13], pages 811-814, stated the 
sufficient part of this result and used it to develop a diagnostic tool for verification of 
angular symmetry of a distribution. This necessary and sufficient condition can also be 
used to develop a statistical test for the angular symmetry of a distribution. As discussed 
in [19], Ajne's test (see [1]), which is a distribution-free test for bivariate data, can be
used for testing angular symmetry of a bivariate distribution about a specified point
(say, $\boldsymbol{\mu}_0$). However, the test that we propose here is applicable to multivariate data in
any dimension and does not require any specification of the center of symmetry, which
is estimated from the data. Given a random sample $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ of size $n$, let $\hat{\mathbf{m}}_n$ be
the half-space median and $\Delta_n$ denote the half-space depth of $\hat{\mathbf{m}}_n$ in that sample. For
testing the null hypothesis of angular symmetry, an ideal procedure would be to reject
the null hypothesis if $\Delta_n < c_n$, where $c_n$ is an appropriate percentile (that depends on the
specified level of the test) of the distribution of $\Delta_n$ under the null hypothesis. However,
it is not possible to determine an exact value of $c_n$ in practice because the distribution
of $\Delta_n$ depends on the underlying angularly symmetric distribution of the data, which is
usually not specified in practice.

In practice, we propose that for a random sample $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, we first compute
$\mathbf{y}_i = \mathbf{x}_i - \hat{\mathbf{m}}_n$ for $i = 1, 2, \ldots, n$, generate i.i.d. observations $z_1, z_2, \ldots, z_n$ such that
$P(z_i = 1) = P(z_i = -1) = 1/2$ and then compute $\mathbf{x}_i^* = z_i \mathbf{y}_i + \hat{\mathbf{m}}_n$ for $i = 1, 2, \ldots, n$. This procedure
is motivated by the well-known idea of bootstrapping. These $\mathbf{x}_i^*$'s can be viewed like
a "bootstrap sample" generated from the original sample under the null hypothesis of
symmetry, and we can calculate the depth $\Delta^*$ of the half-space median $\hat{\mathbf{m}}^*$ based on
that "bootstrap sample". We can repeat this "bootstrap procedure" $M$ times depending
on our computing resources and denote by $\Delta^*_m$ the half-space depth of the half-space
median in the $m$th "bootstrap sample" ($m = 1, 2, \ldots, M$). The critical value $c_n$ mentioned
earlier can then be estimated from the "bootstrap empirical distribution" of $\Delta^*$. In other
words, for a specified level $0 < \alpha < 1$, the null hypothesis of angular symmetry is to be
rejected if $\sum_{m=1}^{M} I\{\Delta^*_m < \Delta_n\}/M < \alpha$.

Table 1. Proportion of rejections of the null hypothesis of angular symmetry out of 1000 Monte Carlo replications

Data set                  D1              D2              D3              D4              D5              D6
Nominal level (α)       1%     5%       1%     5%       1%     5%       1%     5%       1%     5%       1%     5%
d = 2, n = 50        0.012  0.052    0.012  0.054    0.010  0.044    0.170  0.318    0.406  0.663    0.247  0.418
d = 2, n = 100       0.014  0.054    0.014  0.053    0.010  0.058    0.486  0.728    0.870  0.960    0.641  0.846
d = 3, n = 50        0.011  0.044    0.003  0.035    0.015  0.057    0.294  0.554    0.751  0.869    0.403  0.662
d = 3, n = 100       0.009  0.051    0.006  0.040    0.012  0.046    0.822  0.949    0.996  1.000    0.929  0.982
d = 4, n = 50        0.009  0.054    0.013  0.061    0.014  0.067    0.355  0.719    0.812  0.955    0.440  0.824
d = 4, n = 100       0.008  0.043    0.009  0.046    0.012  0.050    0.946  0.987    1.000  1.000    0.984  0.997
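
The sign-flip resampling scheme described in this section can be sketched in a few lines for bivariate data. This is a rough illustration under loud simplifying assumptions, not the authors' R implementation: the half-space median is approximated by the deepest sample point, the depth is approximated over a finite grid of directions, and the names `deepest_sample_point` and `angular_symmetry_test` are hypothetical.

```python
import numpy as np

def deepest_sample_point(sample, directions):
    """Return (point, depth): the sample point with largest approximate depth.

    Assumption: the half-space median is searched only among the sample
    points, and the depth is minimised over a finite set of directions.
    """
    proj = sample @ directions.T                        # (n, K) projections
    # counts[j, k] = number of i with u_k . x_i >= u_k . x_j
    counts = (proj[:, None, :] >= proj[None, :, :]).sum(axis=0)
    depths = counts.min(axis=1) / len(sample)           # min over directions
    best = int(np.argmax(depths))
    return sample[best], depths[best]

def angular_symmetry_test(sample, n_boot=200, n_directions=180, seed=0):
    """Return (delta_n, p_value) for the sign-flip bootstrap test (2-d sketch)."""
    rng = np.random.default_rng(seed)
    angles = np.linspace(0.0, 2.0 * np.pi, n_directions, endpoint=False)
    directions = np.column_stack([np.cos(angles), np.sin(angles)])
    median, delta_n = deepest_sample_point(sample, directions)
    y = sample - median                                  # y_i = x_i - m_n
    boot_depths = np.empty(n_boot)
    for m in range(n_boot):
        z = rng.choice([-1.0, 1.0], size=len(sample))    # random signs z_i
        x_star = z[:, None] * y + median                 # x_i* = z_i y_i + m_n
        _, boot_depths[m] = deepest_sample_point(x_star, directions)
    # Fraction of bootstrap depths strictly below the observed depth.
    p_value = np.mean(boot_depths < delta_n)
    return delta_n, p_value

data = np.random.default_rng(1).standard_normal((50, 2))
delta_n, p_value = angular_symmetry_test(data)
print(delta_n, p_value)  # reject angular symmetry at level alpha if p_value < alpha
```

Restricting the median to sample points keeps the sketch short; an exact Tukey median and exact depth computation would be needed to reproduce the rejection rates reported in Table 1.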

To evaluate the performance of our proposed test, we carried out a thorough simulation 
study with six examples using the software package R. In each case, we generated samples 
of size 50 and 100, implemented our test using M = 1000 "bootstrap samples" and, in 
order to estimate the probability of rejection of $H_0$ by the test, repeatedly applied it on
1000 Monte Carlo replications in dimensions $d = 2, 3$ and $4$. The first five examples were
motivated by five bivariate examples in [13], page 814, which include three examples with
angularly symmetric distributions, namely D1, D2 and D3, and two examples, namely D4
and D5, where the underlying distributions were not angularly symmetric (see [13], page 814,
for a detailed description of these examples). Here, we consider the natural multivariate
version of these five examples. In the last example, D6, which is also not angularly
symmetric, when $d = 2$, we generated observations from a bivariate uniform distribution
on the right-angled isosceles triangle formed by the points $(0,0)$, $(1,0)$ and $(0,1)$. For
an extension of D6 in dimensions $d > 2$, we have considered the simplex formed by the
origin, the coordinate axes and the hyperplane $x_1 + \cdots + x_d = 1$ in $\mathbb{R}^d$ in place of the
triangle. Table 1 reports the proportion of cases, out of 1000 Monte Carlo replications,
where the null hypothesis was rejected for two nominal values of $\alpha$, namely, 0.05 and
0.01. In this table, the rejection rates for D1, D2 and D3 stay close to the nominal levels,
while the power against D4, D5 and D6 increases with both the sample size and the dimension.

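Note that the condition that $(\mathbf{X} - \boldsymbol{\mu})/\|\mathbf{X} - \boldsymbol{\mu}\|$ and $(\boldsymbol{\mu} - \mathbf{X})/\|\boldsymbol{\mu} - \mathbf{X}\|$ are identically
distributed is sufficient for the half-space median to have half-space depth 0.5, even
when $\mathbf{X}$ lies in an arbitrary Banach space $\mathcal{B}$, where $\|\cdot\|$ denotes the norm in $\mathcal{B}$. If $F$ is a
probability distribution over $\mathcal{B}$, and $\mathbf{x}$ is a fixed element in $\mathcal{B}$, then the half-space depth
of $\mathbf{x}$ can be defined as $HD(\mathbf{x}, F) = \inf_{h \in \mathcal{B}^*} P\{h(\mathbf{X} - \mathbf{x}) \ge 0\}$, where $h : \mathcal{B} \to \mathbb{R}$ is a linear
functional that belongs to the dual space $\mathcal{B}^*$, $P$ stands for the probability measure on $\mathcal{B}$
corresponding to $F$, and $\mathbf{X}$ is a random element in $\mathcal{B}$ having the distribution $F$. The point
$\boldsymbol{\mu} \in \mathcal{B}$ is called a half-space median if $HD(\boldsymbol{\mu}, F) = \sup_{\mathbf{x} \in \mathcal{B}} HD(\mathbf{x}, F)$. If, instead of a Banach
space, we work with a Hilbert space $\mathcal{H}$, then, due to the Riesz representation theorem and
the reflexive nature of a Hilbert space, the half-space depth of an observation $\mathbf{x} \in \mathcal{H}$ can
be defined as $HD(\mathbf{x}, F) = \inf_{\mathbf{h} \in \mathcal{H}} P\{\langle \mathbf{h}, \mathbf{X} - \mathbf{x} \rangle \ge 0\}$, where $\langle \cdot, \cdot \rangle$ stands for the inner
product defined on $\mathcal{H}$.

S. Dutta, A.K. Ghosh and P. Chaudhuri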

From the above discussion, it is clear that if we have a symmetric distribution in a 
Hilbert or Banach space, then the point of symmetry will achieve the maximum depth
value 0.5, and it will be the half-space median. So, in a sense, the half-space median is 
well defined and behaves in a nice way, even in infinite-dimensional spaces for symmetric 
probability distributions. However, in infinite-dimensional spaces, even when we deal with 
nice symmetric distributions, the half-space depth function can exhibit some anomalous 
behavior, which we will see in the next section. 

4. Anomalous behavior of half-space depth in 
infinite-dimensional spaces 

We know that if we have a data cloud of $n$ observations in a $d$-dimensional space, then the
empirical depth of an observation lying outside the convex hull formed by the data cloud is
zero. For $d > n$, since the Lebesgue measure of this convex hull is zero, we have zero depth
for all points in a set of probability measure one whenever we have $n$ i.i.d. observations
from an absolutely continuous distribution in $\mathbb{R}^d$. In fact, for any probability measure on
an infinite-dimensional Banach space such that any finite-dimensional hyperplane in that
space has zero probability, the empirical half-space depth based on finitely many i.i.d.
observations from that probability distribution will be zero almost everywhere. So, the
empirical version of half-space depth does not carry any statistically useful information in
such cases. Naturally, we would be curious to know what happens to the population depth
function in such situations. The following theorem demonstrates that it is possible to have
a nice symmetric probability distribution on the $l_2$ space for which the population depth
function takes positive values only on a set of probability measure zero. Recall that the $l_2$
space of real sequences consists of infinite sequences $(x_1, x_2, \ldots)$ such that $\sum_{i=1}^{\infty} x_i^2 < \infty$.
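
The degeneracy of the empirical depth for $d > n$ is easy to see numerically. The sketch below is an illustration under assumed settings ($n = 5$ standard Gaussian observations in $d = 10$ dimensions; the variable names are hypothetical): it constructs a direction $\mathbf{u}$ for which every observation lies strictly on one side of a hyperplane through a fresh point, so that the empirical depth of that point is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 10                        # fewer observations than dimensions
sample = rng.standard_normal((n, d))
new_point = rng.standard_normal(d)  # a fresh draw from the same distribution

# Rows of `diffs` are x_i - x. With n <= d they are linearly independent with
# probability one, so diffs @ u = -1 has an exact solution u.
diffs = sample - new_point
u, *_ = np.linalg.lstsq(diffs, -np.ones(n), rcond=None)

# Every observation lies strictly on the negative side of the hyperplane
# {y : u . (y - x) = 0}, so the closed half-space u . (y - x) >= 0 contains
# no data point and the empirical depth of x is zero.
fraction_in_halfspace = np.mean(diffs @ u >= 0.0)
print(fraction_in_halfspace)
```

Since the empirical depth is the minimum of such fractions over all directions, one direction with fraction zero is enough to conclude that the depth is zero.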

Theorem 3. Consider an infinite sequence of independent random variables $\mathbf{X} = (X_1, X_2, X_3, \ldots)$,
where $E(X_i) = 0$ and $E(X_i^2) = \sigma_i^2$ for all $i \ge 1$ such that $\sum_{i=1}^{\infty} \sigma_i^2 < \infty$.
Note that this implies that $\mathbf{X}$ lies in the $l_2$ space of real sequences with probability one.
Also, assume that the $X_i$'s have finite fourth moments and that $\sum_{i=1}^{\infty} E(X_i^4)/(i^2 \sigma_i^4) < \infty$.
For instance, all these conditions will hold if the $X_i$'s are independent Gaussian random
variables. Then, for any given $\mathbf{x} = (x_1, x_2, \ldots)$ in that $l_2$ space, the half-space depth of $\mathbf{x}$
with respect to the distribution of $\mathbf{X}$ will be zero unless $\mathbf{x}$ lies in a subset having probability
zero.

The proof of this theorem is given in the Appendix. This theorem clearly shows that 
not only the empirical version, but also the population version of the half-space depth will 
exhibit anomalous behavior for some very common distributions in infinite dimensions. 
Since any infinite-dimensional separable Hilbert space is isometrically isomorphic to the $l_2$ space in view of the
existence of a countable orthonormal basis in such a space, similar examples can also be
constructed on separable Hilbert spaces. Clearly, the half-space depth function will not be
a very useful statistical concept in such spaces. To conclude, let us recall the property of
half-space depth characterizing the underlying distribution established by earlier authors 
that was discussed in the Introduction. From the above discussion, it is clear that in a 
separable Hilbert space, there exist several probability measures, which may even have 
independent Gaussian marginals, with half-space depth functions identically equal to 
zero except on a subset having zero probability measure. Nevertheless, such symmetric 
probability measures will have a well-defined half-space median that achieves the depth 
value 0.5. 

Appendix 

Lemma 1. Let $HD(\mathbf{x}, F)$ be the half-space depth of $\mathbf{x}$ with respect to the distribution
$F$, and $F$ have density $f$ of the form $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$ with a monotonically decreasing
function $\psi$ and $0 < p \le \infty$. Then, for any $\mathbf{x} = (x, 0, \ldots, 0)$ on the coordinate axis, we have
$HD(\mathbf{x}, F) = P(X_1 \ge x)$ when $x > 0$, and $HD(\mathbf{x}, F) = P(X_1 \le x)$ when $x < 0$.

Proof. We will prove it for $\mathbf{x}_0 = (1, 0, \ldots, 0)$. Proof for other points follows in the same
way. Consider any hyperplane $\boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' = 0$ other than $x_1 = 1$ that passes through $\mathbf{x}_0$
(see the left-hand diagram in Figure 1 for the case $d = 2$). Here, $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_d)$ is a
vector in $\mathbb{R}^d$. Define the regions $A_1 = \{\mathbf{x} : x_1 < 1 \text{ and } \boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' > 0\}$ and
$A_2 = \{\mathbf{x} : x_1 > 1 \text{ and } \boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' < 0\}$ (see the left-hand diagram in Figure 1 for the case $d = 2$). To
prove the lemma, we have to show that $P(\mathbf{X} \in A_1) > P(\mathbf{X} \in A_2)$. Define
$A_3 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : (x_1, -x_2, -x_3, \ldots, -x_d) \in A_2\}$. Because of the symmetry of $f$, it is easy
to check that $P(\mathbf{X} \in A_2) = P(\mathbf{X} \in A_3)$. Therefore, it is enough to prove that $P(\mathbf{X} \in A_1) > P(\mathbf{X} \in A_3)$.
Note that for every point $\mathbf{z} = (x_1, x_2, \ldots, x_d)$ in $A_1$, we have a point
$\mathbf{z}' = (x_1', x_2, x_3, \ldots, x_d)$ in $A_3$ such that $x_1' = 2 - x_1$. Hence, $|x_1| \le |x_1'|$ and $\|\mathbf{z}\|_p \le \|\mathbf{z}'\|_p$ with
strict inequality being true for all $\mathbf{z}$ not lying on the hyperplane $x_1 = 1$. This implies that
$f(\mathbf{z}) \ge f(\mathbf{z}')$. Since the strict inequality holds over a set of positive measure, integrating
$f(\mathbf{z})$ (resp. $f(\mathbf{z}')$) with respect to $\mathbf{z}$ (resp. $\mathbf{z}'$), we actually get $P(\mathbf{X} \in A_1) > P(\mathbf{X} \in A_3)$. □

Lemma 2. Consider a p.d.f. $f$ on $\mathbb{R}^d$ satisfying $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_\infty)$ and a random vector
$\mathbf{X}$ with p.d.f. $f$. Then, for any $x > 0$, we have $P(X_1 + X_2 \ge 2x) < P(X_1 \ge x)$.

Proof. Again, we will prove this only for $x = 1$. Let us define
$A_1 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : x_1 < 1 \text{ and } x_1 + x_2 > 2\}$ and
$A_2 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : x_1 > 1 \text{ and } x_1 + x_2 < 2\}$ (these
two regions are shown in the right-hand diagram in Figure 1 for the case $d = 2$). We
also define the region $A_3 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : (x_2, x_1, x_3, \ldots, x_d) \in A_1\}$. Because of the
symmetry of $f(\mathbf{x})$ under permutations of the coordinates of $\mathbf{x}$, it is straightforward to see
that $P(\mathbf{X} \in A_1) = P(\mathbf{X} \in A_3)$. Hence, it is enough to show that $P(\mathbf{X} \in A_3) < P(\mathbf{X} \in A_2)$.
Now, for any $\mathbf{z} = (z_1, z_2, \ldots, z_d) \in A_2$, we have a corresponding point
$\mathbf{z}' = (2 - z_2, 2 - z_1, z_3, \ldots, z_d)$ in $A_3$. Also, note that for any $\mathbf{z} = (z_1, z_2, \ldots, z_d)$ in $A_2$, $z_1$ and $z_2$ have
the respective forms $z_1 = 1 + b$ and $z_2 = 1 - b - a$ for some $a, b > 0$ (see the right-hand
diagram in Figure 1 for the case $d = 2$). Consequently, for $\mathbf{z}' = (z_1', z_2', z_3, \ldots, z_d)$, we have
$z_1' = 1 + b + a$ and $z_2' = 1 - b$. Clearly, $\max\{|z_1|, |z_2|\} \le \max\{|z_1'|, |z_2'|\} = 1 + a + b$, which
implies that $\|\mathbf{z}\|_\infty \le \|\mathbf{z}'\|_\infty$ and hence that $f(\mathbf{z}) \ge f(\mathbf{z}')$ with strict inequality on a set of
positive probability measure under $f$. This proves that $P(\mathbf{X} \in A_2) > P(\mathbf{X} \in A_3)$. □

Lemma 3. Let /(x) = ?/>(||x||j,) for 1 < p < oo be the p.d.f. of X = (Xi,X2, ■ ■ ■ , -^d). 
Consider xo = (c, c, 0, . . . , 0) for c > 0. Its half-space depth is then given by HD(xo, F) = 
P(X 1 +X 2 >2c). 

Proof. Consider the hyperplane $x_1 + x_2 = 2c$ (see Figure 2 for the case d = 2). We
have to show that this hyperplane determines the half-space depth of $\mathbf{x}_0$. For this,
we will follow the same lines of argument as in Lemmas 1 and 2. Consider a new
hyperplane $\boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' = 0$ passing through $\mathbf{x}_0$ (see Figure 2 for the case d = 2).
Define the regions
$A_1 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : x_1 + x_2 < 2c \text{ and } \boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' > 0\}$ and
$A_2 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : x_1 + x_2 > 2c \text{ and } \boldsymbol{\alpha}(\mathbf{x} - \mathbf{x}_0)' < 0\}$
(see Figure 2 for the case d = 2). To prove the lemma, we have to show that
$P(\mathbf{X} \in A_1) > P(\mathbf{X} \in A_2)$. Define
$A_3 = \{\mathbf{x} = (x_1, x_2, \ldots, x_d) : (x_2, x_1, x_3, \ldots, x_d) \in A_2\}$. Because of the
symmetry of $f(\mathbf{x})$ under any permutation of the coordinates of $\mathbf{x}$, we have
$P(\mathbf{X} \in A_2) = P(\mathbf{X} \in A_3)$. Therefore, it is enough to show that
$P(\mathbf{X} \in A_3) < P(\mathbf{X} \in A_1)$.

Note that any point $\mathbf{z} \in A_1$ is of the form $\mathbf{z} = (c + a, c - a - k, x_3, \ldots, x_d)$,
where $k > 0$, and $a$ can be positive or negative (see Figure 2 for the case d = 2). For any
$\mathbf{z} \in A_1$, we get a corresponding point $\mathbf{z}' \in A_3$ such that
$\mathbf{z}' = (c + a + k, c - a, x_3, \ldots, x_d)$. We now need to show that
$\|\mathbf{z}\|_p \le \|\mathbf{z}'\|_p$ and for that, we will consider the two cases $a \ge 0$ and $a < 0$
separately.

When $a \ge 0$ (see the left-hand diagram in Figure 2 for the case d = 2), we have
$0 \le |c - a| \le |c + a|$. Now, for $p > 1$ and $t, k > 0$, it is easy to check that the function
$h(t) = (t + k)^p - t^p$ is non-decreasing in $t$. So, for $0 \le t_1 \le t_2$, we have
$0 \le h(t_1) \le h(t_2)$. Taking $t_1 = |c - a|$ and $t_2 = |c + a|$, we get
$(|c - a| + k)^p - |c - a|^p \le (|c + a| + k)^p - |c + a|^p$. Now,
using the facts that $|c + a| + k = |c + a + k|$ and $|c - a - k| \le |c - a| + k$, we arrive at
$|c - a - k|^p - |c - a|^p \le |c + a + k|^p - |c + a|^p$. This implies that
$|c - a - k|^p + |c + a|^p \le |c + a + k|^p + |c - a|^p$, which in turn implies that
$\|\mathbf{z}\|_p \le \|\mathbf{z}'\|_p$. Note that the strict
inequality holds on a set of positive probability measure under $f$.

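For $a < 0$ (see the right-hand diagram in Figure 2 in the case d = 2), first note that
$a + k > 0$ and that the coordinates of $\mathbf{z}$ and $\mathbf{z}'$ are of the respective forms
$\mathbf{z} = (c - \alpha, c - \beta, x_3, \ldots, x_d)$ and $\mathbf{z}' = (c + \alpha, c + \beta, x_3, \ldots, x_d)$,
up to the order of the first two coordinates, where $\alpha = -a > 0$ and $\beta = a + k > 0$. Now,
$|c - \alpha| \le |c + \alpha|$ and $|c - \beta| \le |c + \beta|$ imply that $\|\mathbf{z}\|_p \le \|\mathbf{z}'\|_p$. □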
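
The norm comparison at the heart of this proof is easy to test numerically. The following is a minimal sketch, not part of the paper, that draws random values of c, k, a and p (with a + k >= 0, which covers both cases above) and checks the inequality on the first two coordinates, the only ones that differ between z and z'. The function names and sampling ranges are arbitrary choices made for illustration.

```python
import random


def p_norm_power(coords, p):
    """Return sum(|x_i|^p), the p-th power of the l_p norm."""
    return sum(abs(x) ** p for x in coords)


def check_norm_inequality(trials=10_000, seed=0, rel_tol=1e-9):
    """Check |c+a|^p + |c-a-k|^p <= |c+a+k|^p + |c-a|^p on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.uniform(1.0, 6.0)
        c = rng.uniform(0.01, 5.0)
        k = rng.uniform(0.01, 5.0)
        a = rng.uniform(-k, 5.0)          # a >= 0 or (a < 0 and a + k >= 0)
        z = (c + a, c - a - k)            # first two coordinates of z
        z_prime = (c + a + k, c - a)      # first two coordinates of z'
        lhs = p_norm_power(z, p)
        rhs = p_norm_power(z_prime, p)
        if lhs > rhs * (1.0 + rel_tol) + rel_tol:
            return False
    return True


if __name__ == "__main__":
    print("inequality held in all trials:", check_norm_inequality())
```

A random search of this kind cannot replace the proof, but it is a quick way to catch a sign error when adapting the argument to other norms.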

Lemma 4. Assume that we have a p.d.f. $f$ that satisfies $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$ for some $p > 0$
and monotonically decreasing $\psi$. Let $\mathbf{X} = (X_1, X_2, \ldots, X_d)$ be a random vector with p.d.f.
$f$. If $X_1$ and $2^{(1-p)/p}(X_1 + X_2)$ are identically distributed, then we must have $p = 2$.

Proof. First, note that if $f(\mathbf{x}) = \psi(\|\mathbf{x}\|_p)$, then the joint p.d.f. of $X_1$ and $X_2$ is of the
form $f_1(x_1, x_2) = \psi_1((|x_1|^p + |x_2|^p)^{1/p})$ for some $\psi_1 : \mathbb{R}_+ \to \mathbb{R}_+$. We can show
that the p.d.f.'s of $X_1$ and $Y = 2^{\alpha}(X_1 + X_2)$, where $\alpha = (1 - p)/p$, are given by
$f_{X_1}(x) = \int \psi_1((|x|^p + |x_2|^p)^{1/p})\,dx_2$ and
$f_Y(x) = 2^{-\alpha} \int \psi_1((|2^{-\alpha}x - x_2|^p + |x_2|^p)^{1/p})\,dx_2$, respectively.
Since both of these p.d.f.'s are continuous functions, and $X_1$ and $Y$ are identically
distributed, we can equate their values at $x = 0$. We then get
$\int \psi_1(|x_2|)\,dx_2 = 2^{-\alpha} \int \psi_1(2^{1/p}|x_2|)\,dx_2 = 2^{-(\alpha + 1/p)} \int \psi_1(|x_2|)\,dx_2$.
Hence, we must have $\alpha = -1/p$, that is, $(1 - p)/p = -1/p$, which
implies $p = 2$. □
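
The identity used at x = 0 can be checked numerically. The sketch below assumes, purely for illustration, the profile psi_1(r) = exp(-r^p) (normalizing constants cancel in the ratio) and uses a simple midpoint rule. The function names, integration range and step count are arbitrary choices, not taken from the paper. The ratio f_Y(0)/f_{X_1}(0) should equal 2^{-(alpha + 1/p)} = 2^{-(2-p)/p}, which is 1 only at p = 2.

```python
import math


def integrate_midpoint(g, lower, upper, steps):
    """Midpoint-rule approximation of the integral of g over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(g(lower + (i + 0.5) * h) for i in range(steps))


def density_ratio_at_zero(p, half_width=40.0, steps=80_000):
    """Return f_Y(0) / f_{X_1}(0) for psi_1(r) = exp(-r**p).

    psi_1 is an illustrative assumption; alpha = (1 - p) / p as in Lemma 4.
    """
    alpha = (1.0 - p) / p
    scale = 2.0 ** (1.0 / p)
    f_x1 = integrate_midpoint(
        lambda t: math.exp(-abs(t) ** p), -half_width, half_width, steps
    )
    f_y = 2.0 ** (-alpha) * integrate_midpoint(
        lambda t: math.exp(-(scale * abs(t)) ** p), -half_width, half_width, steps
    )
    return f_y / f_x1


if __name__ == "__main__":
    for p in (1.0, 2.0, 4.0):
        print(p, density_ratio_at_zero(p), 2.0 ** (-(2.0 - p) / p))
```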

Proof of Theorem 2. Note that the "if" part is trivial in view of our discussion pre- 
ceding the statement of the theorem. We shall now prove the "only if" part. 

First, we shall prove it for the bivariate case, that is, d = 2. Without loss of generality,
we assume that $\boldsymbol{\mu} = \mathbf{0}$. Let $Z$ be the angle between the positive side of the $x_1$-axis and
the random vector $\mathbf{X}$ (measured counterclockwise from the $x_1$-axis). Now, consider a
straight line which passes through the origin and makes an angle $\theta$ with the $x_1$-axis. Since
$\boldsymbol{\mu} = \mathbf{0}$, the two half-spaces generated by that straight line will have the same probability
measure. Now, rotate the line in a counterclockwise direction by an angle $\delta$ to bring it
to a new position. Clearly, the two half-spaces generated by the straight line in the new
position will also have the same probability 0.5. This implies that
$P(\theta < Z < \theta + \delta) = P(\pi + \theta < Z < \pi + \theta + \delta)$. Since this equality holds for
all $\theta$ and $\delta$, it implies that $Z$ and $Z + \pi$ have the same probability distribution. The
result now follows from the fact that
$(\mathbf{X} - \boldsymbol{\mu})/\|\mathbf{X} - \boldsymbol{\mu}\|_2 = (\cos Z, \sin Z)$ and
$(\boldsymbol{\mu} - \mathbf{X})/\|\mathbf{X} - \boldsymbol{\mu}\|_2 = (\cos(Z + \pi), \sin(Z + \pi))$.

For $d > 2$, we need to consider $d - 1$ random angles $Z_1, Z_2, \ldots, Z_{d-1}$. Note that here
the direction vector $(\mathbf{X} - \boldsymbol{\mu})/\|\mathbf{X} - \boldsymbol{\mu}\|_2$ can be expressed as
$(\mathbf{X} - \boldsymbol{\mu})/\|\mathbf{X} - \boldsymbol{\mu}\|_2 = (\cos Z_1, \sin Z_1 \cos Z_2, \ldots, \sin Z_1 \cdots \sin Z_{d-2} \cos Z_{d-1}, \sin Z_1 \cdots \sin Z_{d-2} \sin Z_{d-1})$.
Now, consider a hyperplane $H$ which makes angles $\theta_1, \theta_2, \ldots, \theta_{d-1}$ with the
coordinate axes and then rotate it to $H_1$ such that the new angles are
$\theta_1 + \delta, \theta_2, \ldots, \theta_{d-1}$. The result now follows
from the same argument that is used in the bivariate case. □
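
The bivariate setting of this proof can be illustrated numerically. The following sketch, which is not taken from the paper, approximates the empirical half-space depth of a point in the plane by scanning a grid of directions. The sample is drawn from a standard bivariate normal distribution, an assumed example of a distribution that is angularly symmetric about the origin, and the function names are hypothetical. Every line through the origin should then split the sample roughly in half.

```python
import math
import random


def gaussian_sample(n, seed=0):
    """Draw n points from a standard bivariate normal (illustrative choice)."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def empirical_depth_2d(point, sample, n_directions=360):
    """Approximate empirical half-space depth of `point`.

    Takes the minimum, over a grid of unit directions u, of the fraction of
    sample points x with <u, x - point> >= 0. The grid makes this an
    approximation to (an upper bound on) the exact empirical depth.
    """
    px, py = point
    n = len(sample)
    depth = 1.0
    for j in range(n_directions):
        angle = 2.0 * math.pi * j / n_directions
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(1 for (x, y) in sample if ux * (x - px) + uy * (y - py) >= 0)
        depth = min(depth, count / n)
    return depth


if __name__ == "__main__":
    pts = gaussian_sample(2000)
    print("depth at origin:", empirical_depth_2d((0.0, 0.0), pts))
    print("depth at (2, 0):", empirical_depth_2d((2.0, 0.0), pts))
```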

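Lemma 5. For any two sequences $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \ldots)$ and $\mathbf{x} = (x_1, x_2, \ldots)$ in the $l_2$ space
of real sequences, we have
$\sup_{\boldsymbol{\alpha} \in l_2} \{(\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2)^{-1/2} (\sum_{i=1}^{\infty} \alpha_i x_i)\} < \infty$
if and only if $\sum_{i=1}^{\infty} x_i^2/\sigma_i^2 < \infty$.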

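Proof. (The "if" part). For any $\boldsymbol{\alpha} \in l_2$,
$\sum_{i=1}^{\infty} \alpha_i x_i \le (\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2)^{1/2} (\sum_{i=1}^{\infty} x_i^2/\sigma_i^2)^{1/2}$
(i.e., the Cauchy-Schwarz inequality) implies that
$\sum_{i=1}^{\infty} \alpha_i x_i / (\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2)^{1/2} \le (\sum_{i=1}^{\infty} x_i^2/\sigma_i^2)^{1/2}$.
Now, the right-hand side of the inequality does not depend on $\boldsymbol{\alpha}$. So,
$\sum_{i=1}^{\infty} x_i^2/\sigma_i^2 < \infty$ implies the finiteness of
$\sup_{\boldsymbol{\alpha} \in l_2} \{\sum_{i=1}^{\infty} \alpha_i x_i / (\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2)^{1/2}\} \le (\sum_{i=1}^{\infty} x_i^2/\sigma_i^2)^{1/2}$.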

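(The "only if" part). Next, consider the case where $\sum_{i=1}^{\infty} x_i^2/\sigma_i^2 = \infty$. Choose a
sequence $\{\boldsymbol{\alpha}_n\}$ of real sequences, where $\boldsymbol{\alpha}_n = (\alpha_{n1}, \alpha_{n2}, \ldots)$ has non-zero values only at
the first $n$ coordinates (i.e., $\alpha_{ni} = 0$ for all $i > n$) and $\alpha_{ni} = x_i/\sigma_i^2$ for $i = 1, 2, \ldots, n$.
Clearly, $\boldsymbol{\alpha}_n \in l_2$ for all $n \ge 1$, and for each $n$, it is easy to check that
$\sum_{i=1}^{n} \alpha_{ni} x_i / (\sum_{i=1}^{n} \alpha_{ni}^2 \sigma_i^2)^{1/2} = (\sum_{i=1}^{n} x_i^2/\sigma_i^2)^{1/2}$.
So, we get $\sup_{n \ge 1} \{\sum_{i=1}^{n} \alpha_{ni} x_i / (\sum_{i=1}^{n} \alpha_{ni}^2 \sigma_i^2)^{1/2}\} = \infty$. This clearly
implies that we have
$\sup_{\boldsymbol{\alpha} \in l_2} \{\sum_{i=1}^{\infty} \alpha_i x_i / (\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2)^{1/2}\} = \infty$. □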
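
Both halves of the argument can be seen on finite truncations. The sketch below is a hypothetical illustration with arbitrarily chosen sequences x_i = sigma_i = 1/i, both of which lie in l_2 while the sum of x_i^2/sigma_i^2 diverges. It evaluates the ratio for the extremal choice alpha_i = x_i/sigma_i^2 and compares it with the Cauchy-Schwarz bound. The function names are not from the paper.

```python
import math
import random


def ratio(alpha, x, sigma):
    """Return sum(alpha_i x_i) / sqrt(sum(alpha_i^2 sigma_i^2))."""
    numerator = sum(a * xi for a, xi in zip(alpha, x))
    denominator = math.sqrt(sum((a * s) ** 2 for a, s in zip(alpha, sigma)))
    return numerator / denominator


def extremal_alpha(x, sigma):
    """The choice alpha_i = x_i / sigma_i^2 used in the 'only if' part."""
    return [xi / s ** 2 for xi, s in zip(x, sigma)]


def cauchy_schwarz_bound(x, sigma):
    """Return sqrt(sum(x_i^2 / sigma_i^2)), the bound from the 'if' part."""
    return math.sqrt(sum((xi / s) ** 2 for xi, s in zip(x, sigma)))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        x = [1.0 / i for i in range(1, n + 1)]
        sigma = [1.0 / i for i in range(1, n + 1)]
        print(n, ratio(extremal_alpha(x, sigma), x, sigma),
              cauchy_schwarz_bound(x, sigma))
```

With these sequences the bound equals the square root of n, so the attained ratio grows without limit as the truncation level increases, which is the divergence used in the "only if" part.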

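Proof of Theorem 3. Consider any $\mathbf{x}$ in the $l_2$ space with $\mathbf{x} \ne \mathbf{0}$. For any $\boldsymbol{\alpha}$ in
the $l_2$ space, the random variable $Z = \langle \boldsymbol{\alpha}, \mathbf{X} \rangle$ has a probability distribution with
$E(Z) = 0$ and $V(Z) = \sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2$. Using Chebyshev's inequality, we get, for any
$\boldsymbol{\alpha}$ with $\langle \boldsymbol{\alpha}, \mathbf{x} \rangle > 0$,
$P(\langle \boldsymbol{\alpha}, (\mathbf{X} - \mathbf{x}) \rangle \ge 0) = P(Z \ge \langle \boldsymbol{\alpha}, \mathbf{x} \rangle) \le \sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2 / (\sum_{i=1}^{\infty} \alpha_i x_i)^2$.
So, the depth of $\mathbf{x}$ is bounded above by
$\inf_{\boldsymbol{\alpha} \in l_2} \{\sum_{i=1}^{\infty} \alpha_i^2 \sigma_i^2 / (\sum_{i=1}^{\infty} \alpha_i x_i)^2\}$. From Lemma 5, it follows that this
upper bound is zero when $\sum_{i=1}^{\infty} x_i^2/\sigma_i^2 = \infty$. Therefore, $\mathbf{x}$ will have positive depth only if
$\sum_{i=1}^{\infty} x_i^2/\sigma_i^2 < \infty$.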
Next, consider $Y_i = X_i^2/\sigma_i^2$ for $i \ge 1$. The $Y_i$'s are then independent random variables
with a common mean 1 and $\sum_{i=1}^{\infty} V(Y_i)/i^2 < \infty$. So, using the strong law of large numbers
(see Theorem 1 in [3], page 124), we have $n^{-1} \sum_{i=1}^{n} Y_i \to 1$ almost surely as $n \to \infty$.
Consequently, $\sum_{i=1}^{\infty} Y_i = \sum_{i=1}^{\infty} X_i^2/\sigma_i^2 = \infty$ with probability one. □
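
A short simulation makes the last step concrete. The sketch below assumes, for illustration only, independent Gaussian coordinates X_i with mean 0 and standard deviation sigma_i = 1/i. Neither this distribution nor the function name comes from the paper. Under this assumption each Y_i has mean 1, so the running average should settle near 1 while the partial sums grow roughly linearly in n.

```python
import random


def simulate_scaled_squares(n_terms=100_000, seed=0):
    """Return (sum of Y_i, average of Y_i) for Y_i = X_i^2 / sigma_i^2.

    Assumption (illustrative): X_i ~ N(0, sigma_i^2) independently,
    with sigma_i = 1 / i so that (sigma_i) is square-summable.
    """
    rng = random.Random(seed)
    total = 0.0
    for i in range(1, n_terms + 1):
        sigma_i = 1.0 / i
        x_i = rng.gauss(0.0, sigma_i)
        total += (x_i / sigma_i) ** 2
    return total, total / n_terms


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        total, average = simulate_scaled_squares(n)
        print(n, total, average)
```

Because the partial sums keep growing, a random draw X fails the summability condition for positive depth with probability one, which is the anomaly the theorem describes.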